- Hive is a Silicon Valley startup that’s best known for its AI-powered image recognition system, with customers including NASCAR.
- The dirty little secret of artificial intelligence is that it takes a lot of human labor to make it all work. But Hive embraces this.
- Hive pays 600,000 workers and counting to label photos, pennies at a time. You won’t get rich, but as cofounder Kevin Guo says, it’s a simple “game” that makes you money – what other app on your phone can do that?
- The data gets put to use in training AI systems, at a scale that Guo says is unmatched.
One of the worst-kept secrets in Silicon Valley is that it takes a whole lot of human labor to make artificial intelligence…intelligent.
The best example: When Google’s reCAPTCHA pages ask you to identify street signs or storefronts in photos before you can log in, you’re proving you’re not a robot, sure. You’re also providing valuable, human insight into what a street sign looks like, which is extremely useful data when you’re trying to train a self-driving car, or a smart security camera. The whole concept was memorably lampooned in an episode last year of HBO’s “Silicon Valley.”
Enter Hive, a Silicon Valley-based startup that’s embracing this human element to provide AI-powered image recognition that’s “orders of magnitude better than Google,” as cofounder Kevin Guo puts it. Guo’s cofounder, Dmitriy Karpman, dropped out of a prestigious PhD program at Stanford University to make Hive a reality; Guo got his master’s degree there before entering the tech industry.
The secret of Hive, says Guo, is that it’s turned training an AI into a kind of game – one with real cash prizes. Over 600,000 people have signed up for Hive Work, a smartphone app and website, to help train its AI systems. Hive Work asks users to do things like categorize images (a photo of a car might fall under “automobile” and “transport”), or to transcribe a short snippet of audio, or, like Google reCAPTCHA, to identify all the birds in a photo.
The money isn't much, Guo acknowledges, but it adds up to "tens of dollars" pretty quickly, and it's easy enough that you can "play" from your phone while you're on your commute. And, hey, money is money.
"What's the alternative? Playing Candy Crush Saga and losing money," jokes Guo.
The collected human insight is used to train AI systems for customers like NASCAR, which uses the Hive Data product to track how often and how long a corporate logo is displayed on screen during a race. That's information advertisers love having, says Guo. Hive also has other AI products, including Hive Predict, an AI-powered tool for helping companies use their data to spot patterns.
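To make the logo-tracking idea concrete, here is a minimal sketch of how "how often and how long" could be computed once an image recognition model has already said, frame by frame, whether a logo is visible. This is an illustration only, not Hive's method: the function name `logo_exposure`, the per-frame true/false input, and the frame rate are all assumptions made for the example.

```python
def logo_exposure(frame_has_logo, fps):
    """Return (appearances, total_seconds) for one logo.

    frame_has_logo: one boolean per video frame (hypothetical model output).
    fps: frames per second of the video (assumed constant).
    """
    appearances = 0      # number of separate stretches where the logo is visible
    visible_frames = 0   # total frames in which the logo is visible
    previous = False
    for visible in frame_has_logo:
        if visible:
            visible_frames += 1
            if not previous:
                appearances += 1  # logo just came into view
        previous = visible
    return appearances, visible_frames / fps


# Hypothetical clip: 8 frames at 2 frames per second.
frames = [False, True, True, False, True, True, True, True]
count, seconds = logo_exposure(frames, fps=2)
```

In this toy clip the logo comes into view twice and is visible for six of the eight frames. The hard part, of course, is the recognition step that produces the true/false values, which is where the human-labeled training data comes in.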
An advantage over the heavyweights
Hive employs about 60 people, and has raised $30 million or so in venture capital from investors including PayPal cofounder Peter Thiel's Founders Fund. That makes Hive a veritable David versus Silicon Valley Goliaths like Google, which is widely considered to be at the bleeding edge of what's possible with AI.
Indeed, Guo says that if it just came down to sheer manpower, Hive wouldn't stand a chance. Google, Microsoft, Amazon, and others have enticed the leading minds in AI to their sides with hefty pay packages that are far beyond what almost any startup could afford.
"If I tried to fight Google with PhDs, I'd lose," says Guo.
In fact, in an indirect way, Hive couldn't exist without Google. The company's underlying technology is built on a custom version of TensorFlow, an open source artificial intelligence framework first developed by the search giant, and released to the community. This means that, in terms of raw technology, Hive is definitely behind Google.
Instead, Guo says that the community is a big part of Hive's edge. The 600,000 Hive workers who help label images are tested before they're allowed to contribute. The system also automatically slips in a few "known" tasks to test how well they're paying attention and keep everyone honest. When a customer has a specific project, like recognizing a certain logo, it gets sent out to that community in the form of a challenge, with payouts adjusted for difficulty.
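The "known" task trick is a common quality check in crowdsourced labeling, and it can be sketched in a few lines. What follows is a minimal illustration under stated assumptions, not Hive's actual system: the `Answer` record, the function names, and the 80% accuracy cutoff are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    worker_id: str
    task_id: str
    label: str


def known_task_accuracy(answers, known_labels):
    """Score each worker only on tasks whose correct label is already known."""
    seen = {}
    correct = {}
    for a in answers:
        if a.task_id not in known_labels:
            continue  # ordinary task: there is no known answer to grade against
        seen[a.worker_id] = seen.get(a.worker_id, 0) + 1
        if a.label == known_labels[a.task_id]:
            correct[a.worker_id] = correct.get(a.worker_id, 0) + 1
    return {w: correct.get(w, 0) / n for w, n in seen.items()}


def trusted_workers(answers, known_labels, min_accuracy=0.8):
    """Keep workers whose accuracy on known tasks meets an assumed cutoff."""
    scores = known_task_accuracy(answers, known_labels)
    return {w for w, acc in scores.items() if acc >= min_accuracy}


# Hypothetical data: g1 and g2 are "known" tasks, t1 is a real customer task.
known = {"g1": "car", "g2": "bird"}
answers = [
    Answer("w1", "g1", "car"), Answer("w1", "g2", "bird"), Answer("w1", "t1", "dog"),
    Answer("w2", "g1", "car"), Answer("w2", "g2", "car"),
]
trusted = trusted_workers(answers, known)
```

Here worker w1 gets both known tasks right and stays trusted, while w2 gets one of two wrong and falls below the cutoff. Because workers can't tell which tasks are the known ones, the only reliable way to pass is to pay attention on all of them.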
So while Hive's underlying AI technology is built on a customized version of Google's free, open source TensorFlow framework, Guo boasts that his company has a tremendous trove of high-quality, human-vetted AI training data that not even the search giant can match. Hive takes the data, trains the AI, and gives customers the tools to see what the AI came up with. Guo says it's the easiest way for companies to get started with AI.
Beyond the data, Guo says that it's a big advantage that Hive has sunk millions into its own server and networking infrastructure. It means that Hive has total control over its technology stack as it continues to refine its products, he says.
He also says that it's become a boon when talking to customers. As the major tech titans increasingly compete with even the largest, most established corporations, Guo says that Hive's customers are finding that it would be "foolish" to trust a Google or an Amazon with the sensitive data that powers artificial intelligence.